Non-Deterministic Policies In Markovian Processes

Author

  • Mahdi Milani Fard
Abstract

Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making problems in such environments. In recent years, attempts have been made to apply methods from reinforcement learning to construct adaptive treatment strategies, where a sequence of individualized treatments is learned from clinical data. Although these methods have proved useful in problems concerning sequential decision making, they cannot be applied in their current form to medical domains, as they lack widely accepted notions of confidence measures. Moreover, policies provided by most methods in reinforcement learning are often highly prescriptive and leave little room for the doctor's input. Without the ability to provide flexible guidelines and statistical guarantees, it is unlikely that these methods can gain ground within the medical community. This thesis introduces the new concept of non-deterministic policies to capture the user's decision-making process. We use this concept to provide the user with a flexible choice among near-optimal solutions, and to provide statistical guarantees for decisions made under uncertainty. We provide two algorithms that propose flexible options to the user while making sure the performance is always close to optimal. We then show how to provide confidence measures over the value function of Markovian processes, and finally use them to find sets of actions that will almost surely include the optimal one.
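The core idea of a non-deterministic policy, as described above, can be illustrated with a minimal sketch (this is not the thesis's actual algorithm): instead of mapping each state to a single action, the policy maps it to a set of actions whose estimated values are all close to the best available. The Q-values, action names, and tolerance below are illustrative assumptions, not taken from the thesis.

```python
# Minimal sketch of a non-deterministic policy: map a state to the
# SET of actions whose Q-value is within a tolerance eps of the best,
# rather than to the single greedy action. All names and numbers here
# are hypothetical, for illustration only.

def near_optimal_actions(q_values, eps):
    """Return the set of actions whose Q-value is within eps of the max."""
    best = max(q_values.values())
    return {a for a, q in q_values.items() if q >= best - eps}

# Example: hypothetical Q-values for three treatments in one state.
q = {"treatment_A": 10.0, "treatment_B": 9.6, "treatment_C": 7.2}

options = near_optimal_actions(q, eps=0.5)
# treatment_A and treatment_B are both offered as near-optimal choices;
# treatment_C falls outside the tolerance and is excluded.
```

The user (e.g. a doctor) is then free to pick any action from the returned set, with the guarantee that every offered option is within `eps` of the greedy action's estimated value.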


Similar Resources

Non-Deterministic Policies in Markovian Decision Processes

Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making problems in such environments. In recent years, attempts were made to apply methods from reinforcement learning to construct decision support systems for action selection in Markovian environments. Although conventional meth...


Learning in a state of confusion: employing active perception and reinforcement learning in partially observable worlds

In applying reinforcement learning to agents acting in the real world we are often faced with tasks that are non-Markovian in nature. Much work has been done using state estimation algorithms to try to uncover Markovian models of tasks in order to allow the learning of optimal solutions using reinforcement learning. Unfortunately these algorithms which attempt to simultaneously learn a Markov m...


Dynamic Pricing and Inventory Control: the Value of Demand Learning

This paper studies various approaches to demand learning in the context of a one-shot inventory replenishment problem with dynamic pricing. The customer arrival process is assumed to be piecewise deterministic and Markovian with an unknown parameter. Homogeneous customers have an iso-elastic demand function and do not behave strategically. We study full information, non-learning, passive learni...


Stationarity, Time–reversal and Fluctuation Theory for a Class of Piecewise Deterministic Markov Processes

We consider a class of stochastic dynamical systems, called piecewise deterministic Markov processes, with states (x, σ) ∈ Ω × Γ, Ω being a region in R or the d–dimensional torus, Γ being a finite set. The continuous variable x follows a piecewise deterministic dynamics, the discrete variable σ evolves by a stochastic jump dynamics and the two resulting evolutions are fully–coupled. We study st...




Publication date: 2009